Generative World Models of Tasks: LLM-Driven Hierarchical Scaffolding for Embodied Agents

Hill, Brennen

arXiv.org Artificial Intelligence

Recent advances in agent development have focused on scaling model size and raw interaction data, mirroring successes in large language models. However, for complex, long-horizon multi-agent tasks such as robotic soccer, this end-to-end approach often fails due to intractable exploration spaces and sparse rewards. We propose that an effective world model for decision-making must capture not only the world's physics but also its task semantics. A systematic review of 2024 research in low-resource multi-agent soccer reveals a clear trend towards integrating symbolic and hierarchical methods, such as Hierarchical Task Networks (HTNs) and Bayesian Strategy Networks (BSNs), with multi-agent reinforcement learning (MARL). These methods decompose complex goals into manageable subgoals, creating an intrinsic curriculum that shapes agent learning. We formalize this trend into a framework for Hierarchical Task Environments (HTEs), which are essential for bridging the gap between simple, reactive behaviors and sophisticated, strategic team play. Our framework incorporates the use of Large Language Models (LLMs) as generative world models of tasks, capable of dynamically generating this scaffolding. We argue that HTEs provide a mechanism to guide exploration, generate meaningful learning signals, and train agents to internalize hierarchical structure, enabling the development of more capable and general-purpose agents with greater sample efficiency than purely end-to-end approaches.
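The kind of task hierarchy an HTE would scaffold can be sketched as a simple tree whose post-order traversal yields a curriculum from primitive subgoals up to the root goal. The task names and tree structure below are illustrative, not taken from the paper:

```python
# A minimal sketch of a Hierarchical Task Environment's scaffolding:
# a goal decomposed into subgoals, flattened into an intrinsic curriculum.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Task:
    name: str
    subtasks: List["Task"] = field(default_factory=list)

def curriculum(task: Task) -> List[str]:
    """Post-order traversal: primitive subgoals first, root goal last."""
    order = []
    for sub in task.subtasks:
        order.extend(curriculum(sub))
    order.append(task.name)
    return order

# Illustrative soccer task tree (invented for this sketch).
score_goal = Task("score_goal", [
    Task("gain_possession", [Task("intercept_pass"), Task("tackle")]),
    Task("advance_ball", [Task("dribble"), Task("pass_forward")]),
    Task("shoot"),
])

print(curriculum(score_goal))
# ['intercept_pass', 'tackle', 'gain_possession', 'dribble',
#  'pass_forward', 'advance_ball', 'shoot', 'score_goal']
```

An LLM acting as a generative world model of tasks would, in this sketch, produce the tree itself rather than have it hand-coded.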


ASHiTA: Automatic Scene-grounded HIerarchical Task Analysis

Chang, Yun, Fermoselle, Leonor, Ta, Duy, Bucher, Bernadette, Carlone, Luca, Wang, Jiuguang

arXiv.org Artificial Intelligence

While recent work in scene reconstruction and understanding has made strides in grounding natural language to physical 3D environments, it is still challenging to ground abstract, high-level instructions to a 3D scene. High-level instructions might not explicitly invoke semantic elements in the scene, and even the process of breaking a high-level task into a set of more concrete subtasks -- a process called hierarchical task analysis -- is environment-dependent. In this work, we propose ASHiTA, the first framework that generates a task hierarchy grounded to a 3D scene graph by breaking down high-level tasks into grounded subtasks. ASHiTA alternates LLM-assisted hierarchical task analysis -- to generate the task breakdown -- with task-driven 3D scene graph construction to generate a suitable representation of the environment. Our experiments show that ASHiTA performs significantly better than LLM baselines in breaking down high-level tasks into environment-dependent subtasks and is additionally able to achieve grounding performance comparable to state-of-the-art methods.
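The grounding step can be illustrated with a toy scene representation: each subtask names the objects it requires, and a subtask is grounded to any region containing all of them. The rooms, objects, and matching rule below are illustrative placeholders, not ASHiTA's actual scene-graph representation:

```python
# Toy grounding of subtasks to scene regions by object containment.
# The scene contents and the set-inclusion rule are invented for this sketch.
scene_graph = {
    "kitchen": ["kettle", "mug", "sink"],
    "office": ["desk", "laptop", "mug"],
}

def ground(required_objects):
    """Return every region that contains all objects a subtask requires."""
    return [room for room, objs in scene_graph.items()
            if set(required_objects) <= set(objs)]

print(ground(["kettle", "mug"]))  # ['kitchen']
print(ground(["mug"]))            # ['kitchen', 'office'] -- ambiguous grounding
```

The ambiguous second query hints at why ASHiTA alternates task analysis with scene-graph construction: the environment representation must be refined until subtasks ground cleanly.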


An Efficient Approach to Model-Based Hierarchical Reinforcement Learning

Li, Zhuoru (National University of Singapore) | Narayan, Akshay (National University of Singapore) | Leong, Tze-Yun (National University of Singapore)

AAAI Conferences

We propose a model-based approach to hierarchical reinforcement learning that exploits shared knowledge and selective execution at different levels of abstraction, to efficiently solve large, complex problems. Our framework adopts a new transition dynamics learning algorithm that identifies the common action-feature combinations of the subtasks, and evaluates the subtask execution choices through simulation. The framework is sample efficient, and tolerates uncertain and incomplete problem characterization of the subtasks. We test the framework on common benchmark problems and complex simulated robotic environments. It compares favorably against state-of-the-art algorithms, and scales well to very large problems.
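The idea of identifying action-feature combinations common to the subtasks can be sketched as a set intersection over per-subtask feature sets; a shared combination's dynamics could then be learned once and reused across subtasks. The subtasks and feature pairs below are invented for illustration:

```python
# Sketch: find (action, feature) combinations shared by all subtasks,
# so their transition dynamics can be learned once and reused.
# Subtask names and feature pairs are illustrative, not from the paper.
from functools import reduce

subtask_features = {
    "navigate": {("move", "position"), ("move", "battery")},
    "fetch":    {("move", "position"), ("grasp", "gripper")},
    "deliver":  {("move", "position"), ("release", "gripper")},
}

shared = reduce(set.intersection, subtask_features.values())
print(shared)  # {('move', 'position')}: learned once, shared by all subtasks
```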


Active Learning of Hierarchical Policies from State-Action Trajectories

Hamidi, Mandana (Oregon State University) | Tadepalli, Prasad (Oregon State University) | Goetschalckx, Robby (Oregon State University) | Fern, Alan (Oregon State University)

AAAI Conferences

While most work on trajectory mining is applied to predict movements of mobile users, in this paper we consider a more general problem of building behavior models of users from their state-action trajectories. We assume that the user behavior can be compactly modeled as a Probabilistic State-Dependent Grammar (PSDG) which represents a hierarchical policy. The key problem is that while the states and actions of the user are directly observed, his intentional structure is not. We propose to learn the user's policy from a set of selected trajectories and intention queries at selected states in the trajectory. Our main contributions are an algorithm for learning hierarchical policies from state-action trajectories, and principled heuristics for selecting suitable trajectories and intention queries. Experiments in multiple domains show that our approach is effective and more sample-efficient than learning non-hierarchical policies.
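A toy PSDG can be sketched as a grammar whose production probabilities depend on the current state; sampling an expansion yields an action trajectory consistent with a hierarchical policy, while the chosen nonterminals form the hidden intentional structure. The nonterminals, states, and rules below are illustrative, not the paper's:

```python
# Toy Probabilistic State-Dependent Grammar (PSDG): each nonterminal
# expands according to a distribution conditioned on the current state.
# Symbols, states, and probabilities are invented for this sketch.
import random

RULES = {
    # nonterminal -> state -> [(production, probability)]
    "Root":       {"home":   [(["MakeCoffee"], 0.7), (["ReadMail"], 0.3)],
                   "office": [(["ReadMail"], 1.0)]},
    "MakeCoffee": {"home":   [(["boil", "pour"], 1.0)]},
    "ReadMail":   {"home":   [(["open", "read"], 1.0)],
                   "office": [(["open", "read"], 1.0)]},
}

def expand(symbol, state, rng):
    """Recursively sample an expansion; unknown symbols are primitive actions."""
    if symbol not in RULES:
        return [symbol]
    r, acc = rng.random(), 0.0
    for production, p in RULES[symbol][state]:
        acc += p
        if r <= acc:
            return [a for s in production for a in expand(s, state, rng)]
    return []

print(expand("Root", "office", random.Random(0)))  # ['open', 'read']
```

The learning problem in the paper runs in the opposite direction: the observed trajectories are the terminal strings, and the intention queries help recover which nonterminal expansions produced them.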


Bayesian Hierarchical Reinforcement Learning

Cao, Feng, Ray, Soumya

Neural Information Processing Systems

We describe an approach to incorporating Bayesian priors in the maxq framework for hierarchical reinforcement learning (HRL). We define priors on the primitive environment model and on task pseudo-rewards. Since models for composite tasks can be complex, we use a mixed model-based/model-free learning approach to find an optimal hierarchical policy. We show empirically that (i) our approach results in improved convergence over non-Bayesian baselines, given sensible priors, (ii) task hierarchies and Bayesian priors can be complementary sources of information, and using both sources is better than either alone, (iii) taking advantage of the structural decomposition induced by the task hierarchy significantly reduces the computational cost of Bayesian reinforcement learning and (iv) in this framework, task pseudo-rewards can be learned instead of being manually specified, leading to automatic learning of hierarchically optimal rather than recursively optimal policies.
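The Bayesian prior on the primitive environment model can be sketched as a Dirichlet prior over next-state distributions, whose posterior mean blends prior pseudo-counts with observed transition counts. The class and parameter names below are illustrative, not the maxq framework's implementation:

```python
# Sketch of a Dirichlet prior over a primitive transition model, as used
# in model-based Bayesian RL: the posterior mean combines prior
# pseudo-counts (alpha) with observed transition counts.
from collections import defaultdict

class DirichletModel:
    def __init__(self, states, alpha=1.0):
        self.states = states
        self.alpha = alpha                  # prior pseudo-count per next-state
        self.counts = defaultdict(float)    # (s, a, s') -> observed count

    def observe(self, s, a, s2):
        self.counts[(s, a, s2)] += 1.0

    def prob(self, s, a, s2):
        """Posterior mean of P(s' | s, a)."""
        num = self.alpha + self.counts[(s, a, s2)]
        den = sum(self.alpha + self.counts[(s, a, t)] for t in self.states)
        return num / den

m = DirichletModel(states=["s0", "s1"], alpha=1.0)
print(m.prob("s0", "a", "s1"))  # 0.5: uniform prior before any data
m.observe("s0", "a", "s1")
m.observe("s0", "a", "s1")
print(m.prob("s0", "a", "s1"))  # 0.75: posterior mean (1+2)/(2+2)
```

A prior on task pseudo-rewards would take an analogous form over reward estimates; the paper's mixed model-based/model-free scheme then handles composite tasks whose models are too complex to maintain explicitly.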


Automatic Discovery and Transfer of Task Hierarchies in Reinforcement Learning

Mehta, Neville (Oregon State University) | Ray, Soumya (Case Western Reserve University) | Tadepalli, Prasad (Oregon State University) | Dietterich, Thomas (Oregon State University)

AI Magazine

Sequential decision tasks present many opportunities for the study of transfer learning. A principal one among them is the existence of multiple domains that share the same underlying causal structure for actions. We describe an approach that exploits this shared causal structure to discover a hierarchical task structure in a source domain, which in turn speeds up learning of task execution knowledge in a new target domain. Our approach is theoretically justified and compares favorably to manually designed task hierarchies in learning efficiency in the target domain. We demonstrate that causally motivated task hierarchies transfer more robustly than other kinds of detailed knowledge that depend on the idiosyncrasies of the source domain and are hence less transferable.